Origin and evolution of intercrystalline brine in the northern Qaidam Basin based on hydrochemistry and stable isotopes
The Kunteyi Basin, located in the northern Qaidam Basin, hosts a significant potash ore deposit in China. Studying the origin of the potassium-rich intercrystalline brine is of great significance for supporting the exploitation of potassium salts. In this study, the major ion concentrations and isotopic ratios (δ2H, δ18O, and δ11B) of the intercrystalline brine were used to analyze the evolution of the brine. The results show that the intercrystalline brine has a much higher concentration of total dissolved solids than the oil-field brine. Most ions are enriched, with the exception of Ca2+ and Br−. The δ2H and δ18O values are strongly negative, while the δ11B values are positive. Analysis of the CNa/CCl, CBr/CCl, and Cl/(Na + K + Mg) ratios and of the isotopic ratios indicates that (1) atmospheric precipitation is the primary source of water in the brine; (2) the salinity of the brine is mainly controlled by halite dissolution; and (3) the study area was influenced by deep hydrothermal fluids. The thermal water recharged the Pleistocene layer, reacted with polyhalite, and formed Mg- and K-rich brine. The solution rose along the channel formed by the Shuangqiquan Fault and recharged the shallow intercrystalline brine.
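The diagnostic ion ratios named in the abstract (CNa/CCl and CBr/CCl) can be computed from a brine analysis as in the sketch below. The concentrations and the rule-of-thumb interpretations in the comments are illustrative assumptions, not values from the study.

```python
# Sketch: computing the diagnostic ion ratios (Na/Cl and Br/Cl) used to
# distinguish halite dissolution from seawater evaporation. Sample
# concentrations are hypothetical, in mmol/L (Br in umol/L); thresholds
# in the comments are common rules of thumb, not the paper's values.

def molar_ratios(na_mmol, cl_mmol, br_umol):
    """Return (Na/Cl, 1000*Br/Cl) molar ratios for a brine sample."""
    na_cl = na_mmol / cl_mmol
    br_cl = (br_umol / 1000.0) / cl_mmol * 1000.0  # umol -> mmol, then scale
    return na_cl, br_cl

# Hypothetical sample: Na close to Cl and very low Br, the signature of
# halite (NaCl) dissolution rather than evaporative concentration.
na_cl, br_cl = molar_ratios(na_mmol=5200.0, cl_mmol=5300.0, br_umol=120.0)
print(f"Na/Cl = {na_cl:.2f}")       # near 1 -> halite dissolution
print(f"1000*Br/Cl = {br_cl:.3f}")  # low -> Br-depleted halite source
```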
Structure and Activity of a Selective Antibiofilm Peptide SK-24 Derived from the NMR Structure of Human Cathelicidin LL-37
The deployment of the innate immune system in humans is essential to protect us from infection. Human cathelicidin LL-37 is a linear host defense peptide with both antimicrobial and immune-modulatory properties. Despite years of studies of numerous peptides, SK-24, corresponding to the long hydrophobic domain (residues 9–32) in the anionic lipid-bound NMR structure of LL-37, has not been investigated. This study reports the structure and activity of SK-24. Interestingly, SK-24 is entirely helical (~100%) in phosphate buffer (PBS), more so than LL-37 (84%), GI-20 (75%), and GF-17 (33%), while RI-10 and 17BIPHE2 are essentially randomly coiled (helix content: 7–10%). These results imply an important role for the additional N-terminal amino acids (likely E16) of SK-24 in stabilizing the helical conformation in PBS. It is proposed herein that SK-24 contains the minimal sequence for effective oligomerization of LL-37. Superior to LL-37 and RI-10, SK-24 shows an antimicrobial activity spectrum comparable to those of the major antimicrobial peptides GF-17 and GI-20 by targeting bacterial membranes and forming a helical conformation. Like the engineered peptide 17BIPHE2, SK-24 has stronger antibiofilm activity than LL-37, GI-20, and GF-17. Nevertheless, SK-24 is the least hemolytic at 200 µM among LL-37 and the other peptides investigated herein. Combined, these results enabled us to appreciate the elegance of the long amphipathic helix SK-24 that nature deploys within LL-37 for human antimicrobial defense. SK-24 may be a useful template with therapeutic potential.
Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
Self-supervised representation learning has proved to be a valuable component
for out-of-distribution (OoD) detection with only the texts of in-distribution
(ID) examples. These approaches either train a language model from scratch or
fine-tune a pre-trained language model using ID examples, and then take the
perplexity output by the language model as the OoD score. In this paper, we
analyse the complementary characteristics of both OoD detection methods and
propose a multi-level knowledge distillation approach to integrate their
strengths, while mitigating their limitations. Specifically, we use a
fine-tuned model as the teacher to teach a randomly initialized student model
on the ID examples. Besides the prediction layer distillation, we present a
similarity-based intermediate layer distillation method to facilitate the
student's awareness of the information flow inside the teacher's layers. In
this way, the derived student model gains the teacher's rich knowledge about
the ID data manifold due to pre-training, while benefiting from seeing only ID
examples during parameter learning, which promotes more distinguishable
features for OoD detection. We conduct extensive experiments over multiple
benchmark datasets, i.e., CLINC150, SST, 20 NewsGroups, and AG News, showing
that the proposed method yields new state-of-the-art performance. Comment: 11 pages
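The two distillation signals described in the abstract can be sketched without any ML framework: a prediction-layer loss against the teacher's soft labels, and a similarity-based intermediate-layer loss in which the student matches the token-to-token similarity structure inside a teacher layer. The toy hidden states, dimensions, and loss forms below are illustrative assumptions; the paper's layer selection and loss weighting differ.

```python
import math

# Framework-free sketch of multi-level knowledge distillation:
# (1) prediction-layer distillation (cross-entropy vs. teacher soft labels)
# (2) similarity-based intermediate-layer distillation (match the
#     cosine-similarity matrix among token representations).

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similarity_matrix(hidden):  # hidden: list of token vectors
    return [[cosine(h1, h2) for h2 in hidden] for h1 in hidden]

def similarity_distill_loss(teacher_hidden, student_hidden):
    """MSE between teacher and student token-similarity matrices."""
    t = similarity_matrix(teacher_hidden)
    s = similarity_matrix(student_hidden)
    n = len(t)
    return sum((t[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

def prediction_distill_loss(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy of student predictions against teacher soft labels."""
    return -sum(p * math.log(q + eps)
                for p, q in zip(teacher_probs, student_probs))

# Toy example: three tokens with 2-d hidden states.
teacher_h = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
student_h = [[1.0, 0.1], [0.8, 0.2], [0.1, 1.0]]
print(similarity_distill_loss(teacher_h, student_h))
print(prediction_distill_loss([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]))
```

A perfectly distilled student (identical hidden geometry and predictions) drives both losses to zero, which is the sense in which the student inherits the teacher's knowledge of the ID manifold.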
CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition
Cross-lingual named entity recognition (NER) aims to train an NER system that
generalizes well to a target language by leveraging labeled data in a given
source language. Previous work alleviates the data scarcity problem by
translating source-language labeled data or performing knowledge distillation
on target-language unlabeled data. However, these methods may suffer from label
noise due to the automatic labeling process. In this paper, we propose CoLaDa,
a Collaborative Label Denoising Framework, to address this problem.
Specifically, we first explore a model-collaboration-based denoising scheme
that enables models trained on different data sources to collaboratively
denoise pseudo labels used by each other. We then present an
instance-collaboration-based strategy that considers the label consistency of
each token's neighborhood in the representation space for denoising.
Experiments on different benchmark datasets show that the proposed CoLaDa
achieves superior results compared to previous methods, especially when
generalizing to distant languages. Comment: ACL 2023. Our code is available at
https://github.com/microsoft/vert-papers/tree/master/papers/CoLaD
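The instance-collaboration idea can be illustrated with a minimal sketch: a token's pseudo label is trusted less when it disagrees with the labels of its nearest neighbors in representation space. The representations, labels, and weighting rule below are toy assumptions; CoLaDa's actual scheme is more involved and also includes the model-collaboration step.

```python
import math

# Simplified sketch of instance-collaboration-based denoising: weight
# each pseudo label by its consistency with the k nearest neighbors in
# representation space. Vectors, labels, and k are toy values.

def euclid(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def neighbor_consistency(reps, labels, k=2):
    """For each token, the fraction of its k nearest neighbors sharing
    its pseudo label; usable as a denoising weight in [0, 1]."""
    weights = []
    for i, r in enumerate(reps):
        dists = sorted((euclid(r, reps[j]), j)
                       for j in range(len(reps)) if j != i)
        nn_labels = [labels[j] for _, j in dists[:k]]
        weights.append(nn_labels.count(labels[i]) / k)
    return weights

reps = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
labels = ["PER", "PER", "ORG", "PER"]  # "ORG" at index 2 looks noisy
print(neighbor_consistency(reps, labels, k=2))
```

The token whose label disagrees with its entire neighborhood receives weight 0 and would contribute little to training, which is the intended denoising effect.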
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
In long context scenarios, large language models (LLMs) face three main
challenges: higher computational/financial cost, longer latency, and inferior
performance. Some studies reveal that the performance of LLMs depends on both
the density and the position of the key information (question relevant) in the
input prompt. Inspired by these findings, we propose LongLLMLingua for prompt
compression towards improving LLMs' perception of the key information to
simultaneously address the three challenges. We conduct evaluation on a wide
range of long context scenarios including single-/multi-document QA, few-shot
learning, summarization, synthetic tasks, and code completion. The experimental
results show that prompts compressed with LongLLMLingua achieve higher
performance at much lower cost, and the latency of the end-to-end system is
also reduced. For example, on the NaturalQuestions benchmark, LongLLMLingua
gains a performance boost of up to 17.1% over the original prompt with ~4x
fewer tokens as input to GPT-3.5-Turbo. It yields cost savings of $28.5 and
$27.4 per 1,000 samples on the LongBench and ZeroScrolls benchmarks, respectively.
Additionally, when compressing prompts of ~10k tokens at a compression rate of
2x-10x, LongLLMLingua can speed up the end-to-end latency by 1.4x-3.8x. Our
code is available at https://aka.ms/LLMLingua
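Budget-driven prompt compression of the kind described above can be sketched schematically: rank context segments by an importance score and keep the highest-scored segments, in their original order, until a token budget is exhausted. LongLLMLingua derives its scores from a small LM conditioned on the question (question-aware perplexity); the segment texts, token counts, and scores below are mocked for illustration.

```python
# Schematic sketch of budget-driven prompt compression. Importance
# scores are mocked stand-ins for question-aware relevance estimates.

def compress_prompt(segments, scores, token_budget):
    """segments: list of (text, n_tokens). Keep the highest-scored
    segments (emitted in original order) that fit the token budget."""
    ranked = sorted(range(len(segments)), key=lambda i: scores[i],
                    reverse=True)
    kept, used = set(), 0
    for i in ranked:
        if used + segments[i][1] <= token_budget:
            kept.add(i)
            used += segments[i][1]
    return " ".join(seg[0] for i, seg in enumerate(segments) if i in kept)

segments = [("Background boilerplate.", 40),
            ("The answer to the question is in this sentence.", 10),
            ("Tangential detail.", 30),
            ("Key supporting evidence.", 12)]
scores = [0.1, 0.9, 0.2, 0.8]  # mocked question-relevance scores
print(compress_prompt(segments, scores, token_budget=25))
# keeps the two high-score segments: 92 -> 22 tokens, ~4x compression
```

Keeping the surviving segments in their original order matters because, as the abstract notes, LLM performance depends on the position of key information in the prompt, not only on its presence.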
Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources
For languages with no annotated resources, transferring knowledge from
rich-resource languages is an effective solution for named entity recognition
(NER). While all existing methods directly transfer the source-learned model
to a target language, in this paper, we propose to fine-tune the learned model
with a few similar examples given a test case, which could benefit the
prediction by leveraging the structural and semantic information conveyed in
such similar examples. To this end, we present a meta-learning algorithm to
find a good model parameter initialization that could fast adapt to the given
test case and propose to construct multiple pseudo-NER tasks for meta-training
by computing sentence similarities. To further improve the model's
generalization ability across different languages, we introduce a masking
scheme and augment the loss function with an additional maximum term during
meta-training. We conduct extensive experiments on cross-lingual named entity
recognition with minimal resources over five target languages. The results show
that our approach significantly outperforms existing state-of-the-art methods
across the board. Comment: This paper is accepted by AAAI 2020. Code is available at
https://github.com/microsoft/vert-papers/tree/master/papers/Meta-Cros
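The core meta-learning idea, finding an initialization that adapts quickly to a new task with a few gradient steps, can be shown with a Reptile-style simplification on 1-d toy tasks. The quadratic tasks, learning rates, and step counts below are illustrative stand-ins for the pseudo-NER tasks built from similar sentences; the paper's algorithm is MAML-like and operates on full NER models.

```python
# Reptile-style sketch of meta-learning a fast-adapting initialization.
# Each pseudo-task pulls the parameter toward its own optimum with a few
# inner SGD steps; the meta-update moves the initialization toward the
# adapted value. Tasks here are toy 1-d quadratics.

def adapt(theta, task_opt, inner_steps=5, inner_lr=0.4):
    """A few gradient steps on the task loss (theta - task_opt)**2."""
    for _ in range(inner_steps):
        theta -= inner_lr * 2.0 * (theta - task_opt)
    return theta

def reptile(task_optima, meta_steps=200, meta_lr=0.1):
    theta = 0.0
    for step in range(meta_steps):
        opt = task_optima[step % len(task_optima)]
        theta += meta_lr * (adapt(theta, opt) - theta)  # meta-update
    return theta

theta0 = reptile([2.0, 4.0])  # two tasks with optima at 2 and 4
print(theta0)                 # settles between the task optima
```

The learned initialization sits between the task optima, so a handful of inner steps reaches either task's optimum, which is exactly the "fast adaptation to a given test case" property the abstract describes.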
The economic burden of cervical cancer from diagnosis to one year after final discharge in Henan Province, China: A retrospective case series study.
BACKGROUND: In China, the disease burden of cervical cancer remains substantial. Human papillomavirus (HPV) vaccines are expensive and not yet centrally funded. To inform immunization policy, an understanding of the economic burden of the disease is necessary. This study adopted a societal perspective and investigated the costs and quality-of-life changes associated with cervical cancer from diagnosis to one year after final discharge in Henan province, China. METHODS: Inpatient records of cervical cancer patients admitted to the largest cancer hospital in Henan province between Jan. 2017 and Dec. 2018 were extracted. A telephone interview with four modules was conducted in Jun.-Jul. 2019 with a 40% random draw of patients to obtain direct non-medical costs and indirect costs associated with inpatient care, costs associated with outpatient visits, and changes in quality of life measured with the EQ-5D-5L instrument. Direct medical expenditures were converted to opportunity costs of care using cost-to-charge ratios obtained from hospital financial reports. For each clinical stage (IA-IV), total costs per case from diagnosis to one year after final discharge were extrapolated based on inpatient records, responses to the telephone interview, and recommendations on outpatient follow-up in Chinese cervical cancer treatment guidelines. Loss in quality-adjusted life years was obtained using the 'under the curve' method and regression predictions. RESULTS: A total of 3,506 inpatient records from 1,323 patients were obtained. Among 541 randomly selected patients, 309 completed at least one module of the telephone interview. The average total costs per case associated with cervical cancer from diagnosis to one year after final discharge ranged up to 22,888 (in 2018 US dollars), and the quality-adjusted life-year loss varied from 0.05 to 0.26 for stage IA-IV patients. CONCLUSIONS: The economic burden associated with cervical cancer is substantial in Henan province. Our study provides important baseline information for cost-effectiveness analyses of HPV immunization programs in China.
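The 'under the curve' QALY-loss calculation mentioned in the methods can be sketched directly: utility scores observed at follow-up times are integrated with the trapezoid rule, and the loss is the area between a reference utility and the observed curve. The follow-up times and utility values below are illustrative assumptions, not the study's data.

```python
# Sketch of the 'under the curve' method for QALY loss: integrate
# observed utilities (e.g. EQ-5D-5L index values) with the trapezoid
# rule and subtract from the reference utility over the same period.
# Times (in years) and utilities are illustrative, not the study's data.

def qaly_loss(times, utilities, reference=1.0):
    """Area between the reference utility and the observed curve."""
    loss = 0.0
    for (t0, u0), (t1, u1) in zip(zip(times, utilities),
                                  zip(times[1:], utilities[1:])):
        observed = (u0 + u1) / 2.0 * (t1 - t0)   # trapezoid segment
        loss += reference * (t1 - t0) - observed
    return loss

times = [0.0, 0.25, 0.5, 1.0]     # diagnosis to one year after discharge
utils = [0.60, 0.75, 0.85, 0.95]  # hypothetical recovery trajectory
print(round(qaly_loss(times, utils), 3))
```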